Do We Train on Test Data? Purging CIFAR of Near-Duplicates
The CIFAR-10 and CIFAR-100 datasets are two of the most heavily benchmarked
datasets in computer vision and are often used to evaluate novel methods and
model architectures in the field of deep learning. However, we find that 3.3%
and 10% of the images from the test sets of these datasets have duplicates in
the training set. These duplicates are easily recognizable by memorization and
may, hence, bias the comparison of image recognition techniques regarding their
generalization capability. To eliminate this bias, we provide the "fair CIFAR"
(ciFAIR) dataset, where we replaced all duplicates in the test sets with new
images sampled from the same domain. We then re-evaluate the classification
performance of various popular state-of-the-art CNN architectures on these new
test sets to investigate whether recent research has overfitted to memorizing
data instead of learning abstract concepts. We find a significant drop in
classification accuracy of between 9% and 14% relative to the original
performance on the duplicate-free test set. The ciFAIR dataset and pre-trained
models are available at https://cvjena.github.io/cifair/, where we also
maintain a leaderboard.
Comment: Journal of Imagin
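The abstract does not describe how near-duplicates were detected. A minimal sketch of the general idea is to flag each test image whose nearest training image exceeds a feature-similarity threshold; the features, threshold, and function name below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def find_near_duplicates(test_feats, train_feats, threshold=0.99):
    """Return indices of test images whose closest training image exceeds
    a cosine-similarity threshold (a generic sketch of duplicate mining)."""
    # L2-normalize so that dot products equal cosine similarities
    test_n = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    train_n = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = test_n @ train_n.T           # (n_test, n_train) similarity matrix
    best = sims.max(axis=1)             # closest training match per test image
    return np.flatnonzero(best >= threshold)
```

In practice the features would come from a CNN or a perceptual hash; the threshold controls how strict "near-duplicate" is.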
Automatic Query Image Disambiguation for Content-Based Image Retrieval
Query images presented to content-based image retrieval systems often have
several different interpretations, making it difficult to identify the search
objective pursued by the user. We propose a technique for overcoming this
ambiguity, while keeping the amount of required user interaction at a minimum.
To achieve this, the neighborhood of the query image is divided into coherent
clusters from which the user may choose the relevant ones. A novel feedback
integration technique is then employed to re-rank the entire database with
regard to both the user feedback and the original query. We evaluate our
approach on the publicly available MIRFLICKR-25K dataset, where it leads to a
relative improvement of average precision by 23% over the baseline retrieval,
which does not distinguish between different image senses.
Comment: VISAPP 2018 paper, 8 pages, 5 figures. Source code:
https://github.com/cvjena/ai
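The pipeline described above (cluster the query's neighborhood, let the user pick the relevant cluster, then re-rank the database) can be sketched as follows. The linear score combination and its weight `alpha`, as well as the tiny k-means used for clustering, are illustrative assumptions, not the paper's exact feedback-integration technique.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means (Lloyd's algorithm) to cluster the query neighborhood."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def disambiguate_and_rerank(query, db_feats, k=50, n_clusters=3,
                            selected=0, alpha=0.5):
    """Cluster the query's top-k neighborhood, assume the user selected
    cluster `selected`, and re-rank the database by a weighted mix of
    query similarity and selected-cluster similarity."""
    sims = db_feats @ query                    # features assumed L2-normalized
    neighbors = np.argsort(-sims)[:k]          # neighborhood of the query
    centers = kmeans(db_feats[neighbors], n_clusters)
    centroid = centers[selected] / np.linalg.norm(centers[selected])
    # Combine original query relevance with feedback-cluster relevance
    scores = alpha * sims + (1 - alpha) * (db_feats @ centroid)
    return np.argsort(-scores)                 # re-ranked database indices
```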
Hierarchy-based Image Embeddings for Semantic Image Retrieval
Deep neural networks trained for classification have been found to learn
powerful image representations, which are also often used for other tasks such
as comparing images w.r.t. their visual similarity. However, visual similarity
does not imply semantic similarity. In order to learn semantically
discriminative features, we propose to map images onto class embeddings whose
pair-wise dot products correspond to a measure of semantic similarity between
classes. Such an embedding not only improves image retrieval results, but
could also facilitate integrating semantics into other tasks, e.g., novelty
detection or few-shot learning. We introduce a deterministic algorithm for
computing the class centroids directly based on prior world-knowledge encoded
in a hierarchy of classes such as WordNet. Experiments on CIFAR-100, NABirds,
and ImageNet show that our learned semantic image embeddings improve the
semantic consistency of image retrieval results by a large margin.
Comment: Accepted at WACV 2019. Source code:
https://github.com/cvjena/semantic-embedding
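The core requirement above is a set of class embeddings whose pairwise dot products reproduce a given semantic similarity matrix (e.g., derived from lowest-common-ancestor heights in WordNet). As a minimal sketch, any positive-definite similarity matrix can be factorized deterministically, for instance by a Cholesky decomposition; the paper's own class-by-class construction differs in detail but serves the same purpose.

```python
import numpy as np

def class_embeddings(sim):
    """Compute class embeddings E such that E @ E.T == sim, so the dot
    product of two class embeddings equals their semantic similarity.
    Requires `sim` to be positive definite with unit diagonal."""
    E = np.linalg.cholesky(sim)   # sim = E @ E.T; row i embeds class i
    return E
```

With a unit diagonal in `sim`, each embedding automatically has unit norm, so dot products are also cosine similarities.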
Towards Automatic Identification of Elephants in the Wild
Identifying individual animals within a large population is important for
biodiversity monitoring, and especially for collecting data on a small number
of particularly interesting individuals, which must be identified before such
data can be gathered. This identification can be very time-consuming,
especially when the animals look similar and have only few distinctive
features, as elephants do. Moreover, the animals usually remain in one place
only for a short time, during which they must be identified in order to decide
whether new data should be collected on them. A system that supports
researchers in identifying elephants would therefore greatly speed up this
process. In
this paper, we present such a system for identifying elephants in the face of a
large number of individuals with only few training images per individual. For
that purpose, we combine object part localization, off-the-shelf CNN features,
and support vector machine classification to provide field researchers with
proposals of possible individuals given new images of an elephant. The
performance of our system is demonstrated on a dataset comprising a total of
2078 images of 276 individual elephants, where we achieve 56% top-1 test
accuracy and 80% top-10 accuracy. To deal with occlusion, varying viewpoints,
and different poses present in the dataset, we furthermore enable the analysts
to provide the system with multiple images of the same elephant to be
identified and aggregate confidence values generated by the classifier. With
that, our system achieves a top-1 accuracy of 74% and a top-10 accuracy of 88%
on the held-out test dataset.
Comment: Presented at the AI for Wildlife Conservation (AIWC) 2018 workshop in
Stockholm (https://sites.google.com/a/usc.edu/aiwc/home
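The multi-image aggregation step above can be sketched as follows. Averaging the per-image confidence scores is an assumption for illustration; the abstract only states that classifier confidence values are aggregated.

```python
import numpy as np

def aggregate_predictions(confidences, top_k=10):
    """Average per-image confidence scores over several photos of the
    same (unknown) elephant and return the top-k candidate identities."""
    mean_conf = np.asarray(confidences).mean(axis=0)  # (n_individuals,)
    return np.argsort(-mean_conf)[:top_k]             # best candidates first
```

Averaging suppresses per-image noise from occlusion or viewpoint, which matches the reported jump from 56% to 74% top-1 accuracy when multiple images are used.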
Maximally Divergent Intervals for Anomaly Detection
We present new methods for batch anomaly detection in multivariate time
series. Our methods are based on maximizing the Kullback-Leibler divergence
between the data distribution within and outside an interval of the time
series. An empirical analysis shows the benefits of our algorithms compared to
methods that treat each time step independently of one another, without
optimizing over all possible intervals.
Comment: ICML Workshop on Anomaly Detectio
Detecting Regions of Maximal Divergence for Spatio-Temporal Anomaly Detection
Automatic detection of anomalies in space- and time-varying measurements is
an important tool in several fields, e.g., fraud detection, climate analysis,
or healthcare monitoring. We present an algorithm for detecting anomalous
regions in multivariate spatio-temporal time-series, which allows for spotting
the interesting parts in large amounts of data, including video and text data.
In contrast to existing techniques for detecting isolated anomalous data
points, we propose the "Maximally Divergent Intervals" (MDI) framework for
unsupervised detection of coherent spatial regions and time intervals
characterized by a high Kullback-Leibler divergence compared with all other
data given. In this regard, we define an unbiased Kullback-Leibler divergence
that allows for ranking regions of different size and show how to enable the
algorithm to run on large-scale data sets in reasonable time using an interval
proposal technique. Experiments on both synthetic and real data from various
domains, such as climate analysis, video surveillance, and text forensics,
demonstrate that our method is widely applicable and a valuable tool for
finding interesting events in different types of data.
Comment: Accepted by TPAMI. Examples and code:
https://cvjena.github.io/libmaxdiv
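The core idea of the MDI framework can be illustrated in one dimension: fit a distribution inside and outside each candidate interval and keep the interval with the highest KL divergence. The brute-force scan and Gaussian model below are a simplified sketch; the paper's unbiased KL divergence for comparing intervals of different length and its interval-proposal technique for scalability are not shown.

```python
import numpy as np

def kl_gauss(mu1, var1, mu2, var2):
    """KL divergence KL(N(mu1, var1) || N(mu2, var2)) between two
    univariate Gaussians."""
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def maximally_divergent_interval(x, min_len=5, max_len=50):
    """Return (start, end) of the interval whose Gaussian fit diverges
    most (in KL) from the rest of the 1-D series x."""
    n, best, best_iv = len(x), -np.inf, None
    for a in range(n):
        for b in range(a + min_len, min(a + max_len, n) + 1):
            inside = x[a:b]
            outside = np.concatenate([x[:a], x[b:]])
            if outside.size < 2:
                continue
            kl = kl_gauss(inside.mean(), inside.var() + 1e-9,
                          outside.mean(), outside.var() + 1e-9)
            if kl > best:
                best, best_iv = kl, (a, b)
    return best_iv
```

This naive scan is quadratic in the series length, which is exactly why the paper introduces interval proposals for large-scale data.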
Self-Supervised Learning from Semantically Imprecise Data
Learning from imprecise labels such as "animal" or "bird", while making precise predictions such as "snow bunting" at inference time, is an important capability for any classifier when expertly labeled training data is scarce. Contributions by volunteers or results of web crawling lack precision in this manner, but are still valuable. Crucially, such weakly labeled examples are available in larger quantities and at lower cost than high-quality bespoke training data. CHILLAX, a recently proposed method for this task, leverages a hierarchical classifier to learn from imprecise labels. However, it has two major limitations. First, it does not learn from examples labeled with the root of the hierarchy, e.g., "object". Second, annotations are extrapolated to precise labels only at test time, even though confident extrapolations could already be used as training data. In this work, we extend CHILLAX with a self-supervised scheme that uses constrained semantic extrapolation to generate pseudo-labels. This addresses the second concern, which in turn solves the first, enabling an even weaker supervision requirement than CHILLAX. We evaluate our approach empirically and show that it yields a consistent accuracy improvement of 0.84 to 1.19 percentage points over CHILLAX and is suitable as a drop-in replacement, without negative consequences such as longer training times.
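The constrained semantic extrapolation described above can be sketched as follows: an imprecise label is replaced by the classifier's most probable class among that label's descendants, but only when the prediction is confident enough to serve as a pseudo-label. The hierarchy representation, function name, and threshold are illustrative assumptions, not the paper's exact formulation.

```python
def extrapolate_label(given_label, class_probs, hierarchy, threshold=0.9):
    """Return a more precise pseudo-label for a weakly labeled example.
    `hierarchy` maps each label to the set of leaf classes consistent
    with it (its transitive descendants); `class_probs` maps leaf
    classes to predicted probabilities."""
    candidates = hierarchy[given_label]            # classes the label allows
    best = max(candidates, key=lambda c: class_probs[c])
    if class_probs[best] >= threshold:
        return best        # confident: train on the precise pseudo-label
    return given_label     # otherwise keep the original imprecise label
```

Because even a root label like "object" constrains the prediction to all leaf classes, this scheme also lets the model learn from root-labeled examples, addressing the first limitation.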